Nobody needs to be told that UNIX is
suddenly popular today. In this article
we will show you a little of where UNIX
was yesterday and has been
over the past decade. And, without 
meaning in the least to minimize the in- 
credible contributions of Ken Thomp- 
son and Dennis Ritchie, we will bring to 
light many of the others who worked on 
early UNIX versions, and try to show 
where some of the key ideas came from, 
and how they got into the UNIXes of 
today. 

Our title says we are talking about 
UNIX evolution. Evolution? That 
means different things to different peo- 
ple. We use the term — in a way that 
might make the biologist in one of us 
blush — loosely, to describe the changes 
over time among the many different 
UNIX variants in use both inside and 
outside Bell Labs. Ideas, code, and use- 
ful programs seem to have made their 
way back and forth— like mutant 
genes — among all the many UNIXes 
that have been living in the phone
company over the last decade.

Part 1 looks at some of the major 
components of the current UNIX sys- 
tem — the text formatting tools, the 
compilers and program development 
tools, and so on. Most of the work
described in Part 1 took place at
“Research,” a part of Bell Laboratories
(now AT&T Bell Laboratories; then as 
now “the Labs”) and the ancestral 
home of UNIX. In the next part, we 
look at some of the myriad versions of 
UNIX — there are far more than one 
might suspect. This includes a look at 
Columbus and USG UNIXes and the 
Berkeley UNIXes. Columbus, or 
CBUNIX, is the UNIX from Bell Labs 
at Columbus, Ohio. USG is the UNIX 
support group; we are loosely including 
PWB (Programmer’s Workbench),
System III and System V as USG
UNIXes. Berkeley, of course, is the 
University of California at Berkeley. 
You’ll begin to get a glimpse inside the 
history of the UNIX verse. 

Basic sources 

Since we can’t say everything about 
UNIX in this article, we’ll give some 
pointers for those who want to read 
more. Full acknowledgements will be 
found at the tail end of each installment. 
But first some basic sources must be 
mentioned. 

It is a truism that the final source of
information about UNIX is UNIX itself.
And this, of course, requires that
you have a source license. And to get a
source license, you must sign in blood
that you will not divulge the source code
of UNIX in any way, shape or form. So,
in preparing this article, we have stayed
clear of looking at source code. But
there are times when you want to do so,
not so much to find out how some
feature evolved as to see how it really
works.

The UNIX Manuals are a prime source
of information about UNIX. Now, it’s
trendy to deride these manuals, but for
their intended purpose and audience
they are for the most part examples of
good technical writing. The
Programmer’s Manual or User’s Manual,
as it is variously called (more
colloquially known as “Volume 1”),
summarizes in a standard format each
command [...] and many special files (in
the technical sense), as well as system
file formats, games, miscellany, and
maintenance information. Comparing a
series of manuals of different vintages
offers the student of UNIX evolution a
good view on changing conditions.

Volume 2 of the Manual set is a
series of short papers ranging from notes
on installing the system through
compiler reference manuals to
introductory tutorials. These papers,
too, are typically well written but
occasionally incomplete. They are
concise and to the point; some people
find them obscure. But remember the
audience and the background: the
papers are written for the benefit of
sophisticated computer users. It was
always assumed that you would “be a
wizard” or have somebody around to
help you. Or you would go to the
conferences and ask others about
problems. A careful reading of the
manuals was (and is) required to become
a wizard, along with hands-on time spent
using (and eventually modifying) the
system, learning by doing.

These papers, and others written at
Research, established an interesting
tradition, so counter to mainstream
computerdom: You write the program,
you write the documentation. In almost
every case, the authors of the program
are the authors of the paper describing
its details. And in almost every case,
acknowledgement is made to those who
contributed significant ideas, advice or
moral support to the project. This, of
course, has made our work in this paper
easier. It also speaks volumes about
management and about programmers,
both those programmers who write
effective summaries of their programs,
and those who don’t condescend to.

The UNIX manuals are sometimes 
derided for the “BUGS” section. This is 
the place where the author(s) of a pro- 
gram list its design limitations. One 
UNIX critic said of this policy: “If they 
know about the bugs, why don’t they fix 
them?” The point is that the early 
UNIX authors established the benefi- 
cial habit of documenting limits to the 
program, rather than always letting the 
end user find them. Dennis Ritchie
comments: “Every other manual has
bugs sections; they just aren’t
published.” Many of the BUGS sections
were intended as pointers for further de- 
velopment of the programs, rather than 
as warnings to the user. Ritchie adds: 
“Our habit of trying to document bugs 
and limitations visibly was enormously 
useful to the system. As we put out each 
edition, the presence of these sections 
shamed us into fixing innumerable 
things rather than exhibiting them in 
public. I remember clearly adding or
editing many of these sections, then
saying to myself ‘I can’t write this,’ and
fixing the code instead.” [Ritchie,
personal correspondence].

After the manuals, another important
series of papers in a similar vein appeared
in the Bell System Technical Journal,
July/August 1978. This special issue
(Part 2 of the July/August 1978 issue) is
often referred to as “the blue book”
because of its blue binding. (In reality
all issues of the BSTJ from this time
period are blue, so the handle is a bit
misleading.) The magazine is now called
Bell Labs Technical Journal and is
doing another special issue on
UNIX that should come into print
around the same time as this article.
Watch for it!

Many of the technical reports from 
Research are published as Computer 
Science Technical Reports (CSTRs); 
those still in print are available from 
AT&T Bell Labs. 

Brian Kernighan has co-written 
several books containing interesting his- 
torical details. We will quote later from 
Software Tools, a book he wrote with P. 
J. Plauger, and The UNIX Program- 
ming Environment, which he wrote with 
Rob Pike. 

Finally, access to a nearly complete 
collection of back issues of ;login:, the
journal of the USENIX Association, 
has been invaluable. 

Text processing tools 

One of the guiding lights of the 
UNIX utilities or software tools has 
been the deeply felt conviction that text
should be stored in as simple, as general 
a format as possible, so that any pro- 
gram can easily process it. This idea (it 
seems to have been present from the be- 
ginning) has had the widest impact pos- 
sible on UNIX in all its varieties. In re- 
cent times, however, there has been a 
regrettable tendency to move away from 
it, especially among commercial soft- 
ware developers. 

We have rather arbitrarily divided 
the software tools into text processing 
tools and program development tools. 
Remember that UNIX makes no dis- 
tinction between text files, program files 
and data files. Many of the same tech- 
niques can be applied to all three. But 
more on this later. First, an outline of 
the major tools and their development. 

An old editor made new. The stan- 
dard UNIX text editor ed has a lineage 
longer than many of us do. As early as 
1969, the first assembly language ver- 
sion of ed was in place. Although later 
rewritten in C, the editor is fundamen- 
tally the same program as used then. 
Kernighan and Plauger wrote in 1976:

“The earliest traceable version of 
the editor presented here is TECO, writ- 
ten for the first PDP-1 timesharing sys- 
tem at MIT. It was subsequently imple- 
mented on the SDS-940 as QED, the 
“quick editor,” by L. P. Deutsch and B. 
W. Lampson (see “An online editor,” 
CACM, December, 1967). K. L. 
Thompson adapted QED for CTSS on 
the IBM 7090 at MIT, and later D. M. 
Ritchie wrote a version for the GE-635 
(now HIS-6070) at Bell Labs. 

“The latest version is ed, a simplified
form of QED for the PDP-11, written by
Ritchie and Thompson. Our editor
closely resembles ed, at least in
outward appearance.” [Software Tools, 
page 217]. This is not to say that ed is 
the same as the TECO found on today’s 
DEC computers; far from it. For one 
thing, TECO is character-oriented 
while ed is line-oriented. It seems rather 
a case of common ancestry. 

During the 1970’s, the editor went 
through countless revisions. Nearly ev- 
ery university had its own modified ver- 
sions of ed and QED; some had several 
modified versions. Jay Michlin of Bell 
Labs wrote (in IBM assembler) a QED 
for IBM’s mainframe TSO; this was
released to universities in the mid-70’s.
This was, in fact, one of my (Darwin) 
earlier exposures to the UNIX philoso- 
phy; around 1975, I heard about a 
“spiffy new editor” for TSO, so I or- 
dered and installed it on the TSO system 
at the University of Toronto. 

Did this wide variety of editor
versions lead to massive confusion? Not
really. For although most of the editors 
added new commands and features, 
they seldom deleted old ones. The result
was that you could — and this is still 
true — learn a basic set of ed commands 
and special characters usable on every 
version. 

Today the Seventh Edition, 
4.xBSD and System III/V versions of 
ed are all sufficiently similar that one 
can move freely amongst them with 
only minor inconvenience. (Berkeley 
UNIX includes both the standard ed
and a different editor called ex and edit.
This editor, which has common code 
with vi, has similarities to the standard 
editor, but is not close enough that one 
can freely move between it and normal 
UNIX ed.) The manual pages for every 
current version of ed are all recogniz- 
ably derived from, say, the Sixth Edi- 
tion document. System III/V extends 
the ‘u’ (undo) command, but most of the 
other commands are constant. If you’ve 
used ed, you’ve used an editor with a 
long history, and probably a long 
future. 

roff. Having a good text editor is 
only half the text-processing battle. 
Having entered your text, you still must 
format it neatly for presentation. That’s 
the function of a text formatting pro- 
gram. The earliest UNIX formatter 
known to man is roff, a line-command 
formatter. Like ed, roff is part of a large
and diverse family, one that includes the
runoff package found on Digital
Equipment computers (the latest release
is called DSR, for DEC Standard
Runoff). The earliest Runoff program is
attributed by Kernighan and Plauger to J.
Saltzer, who wrote it for CTSS. Runoff 
also is an ancestor of the Script pro- 
grams available on IBM mainframe sys- 
tems; that descent would be equally in- 
teresting for IBMers to trace (no doubt 
we’ll get letters from those with infor- 
mation to SHARE with us). 

roff was written by M. D. McIlroy
at Research. Like ed, roff was well in 
place by the First Edition of UNIX. It 
was considered static by the time of the 
Sixth Edition and obsolescent by the 
Seventh, then was dropped altogether 
from System III. 

nroff — the assembler of text. 
Computerists are never satisfied. So
after roff came ‘New Roff’, or nroff,
written by the late Joseph Ossanna, who
throughout his career was concerned
with improving the way text was
handled. Ossanna’s nroff, as Kernighan
and Pike relate:

“. . .was much more ambitious 
[than roff]. Rather than trying to pro- 
vide every style of document that users 
might ever want, Ossanna made nroff 
programmable, so that many format- 
ting tasks were handled by program- 
ming in the nroff language. 

“When a small typesetter was ac- 
quired in 1973, nroff was extended to 
handle the multiple sizes and fonts and 
the richer character set that the
typesetter provided. The new program
was called troff (which by analogy to
‘en-roff’ is pronounced ‘tee-roff’). nroff
and troff are basically the same
program ...” with divergent processing
appropriate to the differences in output
device. [UNIX Programming
Environment, page 289].

They point out that troff is tremen- 
dously flexible, and indeed many com- 
puter books have been typeset using it. 
But it can be complex to use. As a result, 
most everyone uses one or another 
“macro package” — a series of pre-pro- 
grammed formatting commands — and 
optionally one of the pre-processors 
(such as eqn, tbl, and more recently
refer, pic and ideal). troff was originally
written in assembler, but was redone in 
C in 1975. Joseph Ossanna wrote both 
versions and maintained them until his 
death in 1977. 

Macro packages. The earliest mac- 
ro package to come into wide use was 
‘ms’ for “manuscript.” Written by Mike 
Lesk, the ‘ms’ macros provide a power- 
ful but easy-to-learn (by comparison 
with bare nroff) approach to document 
formatting. The ‘ms’ macros were
distributed with the Sixth and Seventh
Edition UNIX and most subsequent
releases. The package was modified at
Berkeley, which also gave us the ‘me’ 
macro package. 

The USG versions of UNIX in- 
clude a macro package called ‘mm’ for 
“memorandum macros.” These do
most of the same things as ‘ms’, in 
slightly different ways, with the addi- 
tion of numbered lists and a few other 
bells and whistles, but are about half 
again as big as ‘ms’. The startup time 
with ‘mm’ is such that USG in 1979 had 
to resort to a compacted form of the 
macro packages; this found its way into 
System III. 

There are two versions of the ‘man’ 
macro package used to format the man- 
ual pages in Volume 1 of the UNIX 
Manual Set. One was used up to V6, and 
the other from V7 on. If you see a man- 
ual page beginning with ‘.th’ instead of 
‘.TH’, it’s from V6. System III has an 
(undocumented) command mancvt, and
4.1BSD has trman, to convert manual
pages from the old to the new format. 

There is also the ‘mv’ macro set for 
producing viewgraph or slide presenta- 
tions. This is a USG product, and ver- 
sions of the USG manuals from PWB 1 
up to just before System III carried the 
now-famous line: “The PWB/UNIX
document entitled PWB/UNIX View 
Graph and Slide Macros is not yet avail- 
able. Viewgraph Macros is in prepara- 
tion.” System III manuals appeared 
with scarcely a mention of ‘mv’, and it 
was first documented in the System V 
manuals. 

tbl, eqn. One view of troff is as an 
assembler language for text processing. 
If this be true, then eqn and tbl are the 
high-level compilers that go with it. 

Mathematics has always been an
inconvenience to traditional
typesetting. This observation led Brian
Kernighan and Lorinda Cherry to
develop eqn for UNIX, and would later
lead Donald Knuth to write his TeX
typesetting package with math
capabilities built in.

The eqn program reads an entire 
nroff/troff input file and passes it un- 
changed except for “equation specifica- 
tions” delimited by .EQ and .EN re- 
quests. Material inside these requests is 
used to construct equations of
considerable complexity from simple
input. English words such as ‘sum’,
‘x sub i’, and
‘infinity’ produce the expected results (a 
large Sigma, x with a subscript i, and the 
infinity symbol respectively). In most 
cases no typesetter wizardry is required. 
A list of some 40 extra character
definitions lives in the file
/usr/pub/eqnchar; these can be copied,
extended, or altered by the
knowledgeable user.
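
The pass-through behaviour described above can be sketched in C. This is only an illustration of the filtering idea, not eqn’s actual source: the names classify_line, filter_equations and translate_equation are our own inventions, and translate_equation merely emits the equation text as a formatter comment where the real eqn would generate troff requests. Lines are assumed to be shorter than the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Kinds of line a preprocessor in the style of eqn must tell apart. */
enum line_kind { PASS_THROUGH, EQ_START, EQ_BODY, EQ_END };

/*
 * Classify one input line. *in_eqn carries the state between calls:
 * nonzero while we are between a .EQ request and the matching .EN.
 */
enum line_kind classify_line(const char *line, int *in_eqn)
{
    assert(line != NULL && in_eqn != NULL);
    if (!*in_eqn && strncmp(line, ".EQ", 3) == 0) {
        *in_eqn = 1;
        return EQ_START;
    }
    if (*in_eqn && strncmp(line, ".EN", 3) == 0) {
        *in_eqn = 0;
        return EQ_END;
    }
    return *in_eqn ? EQ_BODY : PASS_THROUGH;
}

/* Hypothetical stand-in for the real translation work: emit a comment. */
static void translate_equation(const char *line, FILE *out)
{
    fprintf(out, ".\\\" equation: %s", line);
}

/*
 * Copy `in` to `out`. Everything except the body of an equation
 * passes through unchanged. Returns the number of equation lines seen.
 */
int filter_equations(FILE *in, FILE *out)
{
    char line[1024];
    int in_eqn = 0, equation_lines = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        switch (classify_line(line, &in_eqn)) {
        case EQ_BODY:
            translate_equation(line, out);
            equation_lines++;
            break;
        default:            /* ordinary text and the .EQ/.EN lines */
            fputs(line, out);
            break;
        }
    }
    return equation_lines;
}
```

Called as filter_equations(stdin, stdout), the sketch behaves like a filter in a pipeline, which is how the preprocessors are placed in front of the formatter.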

eqn was written by Brian 
Kernighan and Lorinda Cherry. The
“new graphic symbols” in
/usr/pub/ascii are the work of Carmela
L’Hommedieu (formerly Scrocca) at
Bell Labs. The first public write-up of
eqn appears to be a paper by Kernighan
and Cherry in the CACM, March 1975.
The software was included in the Sixth 
Edition UNIX. 

Like eqn, tbl is a preprocessor that 
passes over a formatter input file look- 
ing for special requests (here .TS and 
.TE). The material between these
requests is expected to be a series of
special commands to tbl and some tabular
data. To greatly oversimplify how this
program works, tbl replaces tab charac- 
ters with explicit horizontal and vertical 
moves to make the rows and columns in 
the table align exactly under control of 
the table specification. It is invaluable 
for putting tabular material of any kind 
into documents. 

Mike Lesk wrote tbl at Research; 
the idea for it came from an earlier table 
formatting program by J. F. Gimpel. tbl 
first appeared outside the Labs with the 
V6 release of UNIX. It appeared in its 
present form on the “Phototypesetter 
Version 7” (interim V7) PWB tape and 
in Seventh Edition UNIX distributions, 
and in all systems since then. 

Lesk also wrote refer, a bibliogra- 
phy citation and reference package, 
which first appeared in V7. 

Typesetter-independent troff. 
New! Improved! Yet again! That’s right.
troff is infinitely perfectible. In 1979,
Brian Kernighan at Research set out to
modify troff. Rather than rewrite it 
completely and be incompatible with 
the tens or hundreds of thousands of 
documents in existence, he chose to 
“clean up” troff. The task turned out to 
be rather more akin to cleaning the Au- 
gean Stables than he had imagined, but 
resolve did not desert Kernighan.
Finally he emerged with a tape for the
Device Independent troff, along with
revised tbl/eqn and two new
preprocessors, pic and Chris Van Wyk’s
ideal. pic (as the name implies) draws
pictures. It is useful for drawing
flowchart-like drawings, but there is
much more to it than that. ideal also
draws pictures, but is somewhat more
mathematical in usage than is pic.

This set of products forms the basis
of the commercialised “Documenter’s
Workbench” package from AT&T.
And work continues, of course.
Kernighan has been working on
cleaning up the appearance of eqn output.
Recently, Kernighan and Jon Bentley
have written grap, a graph plotting
preprocessor for pic. The program will not
likely be released for some time, but a 
report on it should appear shortly as a 
CSTR. 

Of Mice and Blits. Tired of typing 
at a dull, boring 24 x 80 screen, Rob 
Pike and Bart Locanthi had a better 
idea. Integrating the ideas of the Alto 
project and related work at Xerox 
PARC (Palo Alto Research Centre) 
with the UNIX approach to things, they 
built a special terminal called the Blit 
(not an acronym) with a 68000 proces- 
sor, high-resolution screen, good key- 
board, a mouse, and software shared be- 
tween the host and the terminal. Their 
particular combination of these ingredi- 
ents makes possible a form of interac- 
tion with the computer that is not yet 
understood by 99% of the people work- 
ing in the computer field. Pike, in addi- 
tion to being a radical advocate of the 
UNIX approach to software develop- 
ment, is quite visionary. Some of his 
work has been described at recent 
USENIX conferences. And most of the 
audience didn’t seem to grasp the essen- 
tials of what he was saying. In 1969 
Steve Munro described to me somebody 
with what he called “Unit Record
Mentality,” meaning somebody firmly
attached to the (even then obsolescent)
card punches, readers, and impact 
printers. By analogy with this, I could 
describe those who don’t dig Pike’s ter- 
minal interaction as being possessed of a 
“preinteractive mentality.” But that 
would be premature. 

True Blits are available only inside 
Bell Labs. The Blit has been commer- 
cialised by AT&T/Teletype, and is sold 
(with a different processor) as the 5620
terminal.

Style, Diction, Writer’s Work- 
bench. One of us (Darwin) has long 
been interested in the computerised pro- 
cessing of text, a term I take to mean 
more than is commonly included as 
“word processing.” So I was quite inter- 
ested to read a paper by L. E. McMa- 
hon, Lorinda L. Cherry and R. Morris 
entitled “Statistical Text Processing” in 
the 1978 special BSTJ issue on UNIX. I 
would later use several of the techniques 
mentioned in the paper. 

At the end of the paper, Ms. Cherry 
describes a program parts for finding 
parts of speech in English text. This was 
written to be the first pass of a system to 
add inflection to the speak program 
written by Doug McIlroy, but the
person doing the stress part left the
company. Ms. Cherry wasn’t interested in the
stress assignment, so she documented 
the work done so far and went on to oth- 
er things. 

In the spring of 1979, W. Vesterman
of Rutgers approached Doug McIlroy at
Research about computerizing one of
the techniques Vesterman
used in teaching writing. The students 
had to count surface features in their 
text and in a sample of text written by a 
professional writer. That summer, Ms. 
Cherry expanded parts considerably, 
and added the code that turned it into 
style, a program to analyse the readabil- 
ity and other characteristics of a textual 
document. She also developed diction to 
check for awkward word uses, overused 
words, and other problems facing every- 
one who composes text for others to 
read. In addition, she modified deroff to 
find the real text in a document. 
Vesterman consulted on this work. 
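
To give a flavour of what “counting surface features” means, here is a minimal C sketch that counts words and sentences and derives an average sentence length. It is not the algorithm used by style, which we have not examined; the names count_surface and average_sentence_length are hypothetical, and the rules for what ends a word or a sentence are deliberately naive (an ellipsis, for instance, would be counted as three sentences).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Two simple surface features of a piece of text. */
struct surface_counts {
    size_t words;
    size_t sentences;
};

/*
 * Count words (runs of letters and digits, allowing an apostrophe
 * inside a word) and sentences (each '.', '!' or '?').
 */
struct surface_counts count_surface(const char *text)
{
    struct surface_counts c = {0, 0};
    int in_word = 0;

    assert(text != NULL);
    for (const char *p = text; *p != '\0'; p++) {
        unsigned char ch = (unsigned char)*p;

        if (isalnum(ch) || (ch == '\'' && in_word)) {
            if (!in_word) {
                c.words++;          /* first character of a new word */
                in_word = 1;
            }
        } else {
            in_word = 0;
            if (ch == '.' || ch == '!' || ch == '?')
                c.sentences++;
        }
    }
    return c;
}

/* Average number of words per sentence; 0 if there are no sentences. */
double average_sentence_length(struct surface_counts c)
{
    return c.sentences == 0 ? 0.0
                            : (double)c.words / (double)c.sentences;
}
```

Running the same counts over a student’s text and over a professional sample gives two sets of numbers to compare, which is the exercise Vesterman’s students had been doing by hand.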

And when the 4.1 BSD release of 
the system came out, I was pleasantly 
surprised to see that style and diction 
were present. Bell Labs has a policy of 
sometimes releasing software to educa- 
tional institutions; this probably ex- 
plains the release at Berkeley. 

While this was going on, the Hu- 
man Factors group at Piscataway (now 
at Summit) was getting interested in 
automating document review, and Nina 
Macdonald of that group called Ms.
Cherry about using parts. Macdonald had
worked at Murray Hill in a Linguistics
group and was familiar with the
program. She then took style and
diction, and WWB evolved from there. 
Writer’s Workbench (WWB) consists of 
style, diction and a dozen or so related 
programs for finding problems in writ- 
ten work. The “chattiness” level of the 
programs is set for the beginning user,
but can easily be adjusted by the
advanced user. The ideas for this work
came from the Piscataway group, the 
Murray Hill group, and from Colorado 
State University, where extensive use of 
the Writer’s Workbench (described at 
USENIX, Toronto, July 1983) current- 
ly puts several thousand undergradu- 
ates on WWB each year. The use of 
WWB is perceived to improve signifi- 
cantly the students’ writing skills. 

Many writers will be thankful to all
who contributed, because these
programs have proven themselves useful
many times over. If buying a 4.1 or 
4.2BSD system, insist on style and dic- 
tion. If you get a System V UNIX, con- 
sider getting the WWB add-on if you’ll 
be doing any document preparation. 
Writer’s Workbench is one product that 
should survive and prosper as UNIX 
continues to evolve. The next major re- 
lease of WWB (3.0) is scheduled for the 
spring of 1985. 

Compilers, languages, tools 

What is an operating system with- 
out languages and utilities? Despite its 
limited support for Fortran, UNIX has 
always been known for the diversity of 
languages and tools that it provides. 
Some of these are well known; others 
are perhaps less well known than they 
ought to be. 

The C programming language. The
early evolution of the C language has
been described elsewhere (see the
reprint of Dennis Ritchie’s paper in
Microsystems, October, 1984). Dennis is
rather modest, and doesn’t tell you that 
the UNIX world has named the C com- 
piler described there “the Ritchie com- 
piler” (to distinguish it from other C 
translators). As we pick up the threads 
of the story, Fifth Edition UNIX has 
been in the field for some time. It is 
May, 1975, and the new improved Sixth 
Edition is about to be released. Ritchie 
has added some support for a new data- 
type, ‘long integers’ referred to with the 
keyword ‘long’, but this will not be 
documented. Not all the runtime sup- 
port has been installed, and the tape 
goes out without it. Later Ken Thomp- 
son will announce that the support for 
‘longs’, limited though it be, was there 
all along in V6. There is no support for 
‘short integers’, or ‘shorts’. 

There follows a succession of re- 
leases of the C compiler. The PWB 1.0 
release of UNIX, the first outside the 
Labs of a non-Research UNIX from 
Bell, goes out in 1977. And shortly 
thereafter a special-release tape known 
only as “Phototypesetter Version 7” in- 
cludes a new release of troff as well as 
the C compiler, assembler, loader, 
archiver and bits of the C library
including the first release of ‘stdio’. These
compilers seem to be from about the same
vintage. Both compilers support anoth- 
er new datatype modifier, ‘unsigned’, 
which causes all bits of an integer to be 
treated as magnitude (on the PDP-11, 
for example, signed ints are from -32768 
to +32767, while unsigned ints are 
from 0 to 65535). These compilers add 
typedefs, which allow you to generate 
your own names for existing datatypes, 
for a degree of independence from the 
machine datatypes. One of these com- 
pilers is somewhat buggy — the concept 
of ‘cast’ is in the code but doesn’t work 
properly. Bit fields exist but are buggy; 
this is documented. The Phototypeset- 
ter Version 7 was primarily a release of 
troff; the compiler was included be- 
cause it was necessary for troff (very 
convenient, since Research wanted to 
get the latest C out into the field 
anyway). 
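
The ranges quoted above follow directly from the 16-bit word of the PDP-11. The sketch below assumes a modern C compiler with <stdint.h> (which did not exist at the time) so that the 16-bit width can be pinned down on any machine; the typedef names pdp_int and pdp_unsigned and the function as_magnitude are ours, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* typedef gives a machine-independent name to a machine datatype. */
typedef int16_t  pdp_int;       /* hypothetical name: 16-bit signed int   */
typedef uint16_t pdp_unsigned;  /* hypothetical name: 16-bit unsigned int */

/* The limits of each type on a 16-bit machine, as given in the text. */
enum {
    PDP_INT_MIN      = -32768,
    PDP_INT_MAX      =  32767,
    PDP_UNSIGNED_MAX =  65535
};

/*
 * Read the same 16 bits as a pure magnitude rather than as a signed
 * value. The conversion is well defined: it works modulo 65536.
 */
pdp_unsigned as_magnitude(pdp_int value)
{
    assert(value >= PDP_INT_MIN && value <= PDP_INT_MAX);
    return (pdp_unsigned)value;
}
```

With these definitions, as_magnitude(-1) yields 65535 and as_magnitude(-32768) yields 32768: the same bit patterns, read with every bit counted as magnitude.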

Finally the Seventh Edition of 
UNIX is released. Of course, it has an- 
other C compiler. This one, for the most 
part, is a “shaken down” version of the 
“Phototypesetter C” compiler. It is a lot 
more solid, although bit fields are still 
broken and now the bug is not docu- 
mented. And there is a special kludge 
for uucp whereby casting an expression 
involving a character pointer to type un- 
signed treats the character referenced 
by the pointer as an unsigned character, 
a concept not yet in the compiler or
language. This will be quietly withdrawn
later. The compiler has a bug in that it 
treats the right, not the left, side of an 
assignment as the value of the assign- 
ment expression. The V7 stdio exploits 
these two bugs, thereby making it 
nonportable. The semantics of casts will 
remain unsettled until slightly after V7. 
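
The assignment bug is easiest to see when the two sides differ in width. In C as it is now defined, the value of an assignment expression is the value stored in the left operand, after conversion to the left operand’s type. The sketch below is written for a modern compiler with 8-bit chars and uses unsigned char, which (as noted above) the V7 compiler did not yet have; the function name assignment_value is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/*
 * Assign a wide value to a narrow variable and report the value of
 * the assignment expression itself.
 */
int assignment_value(int wide)
{
    unsigned char narrow;
    int result;

    assert(CHAR_BIT == 8);      /* assumption made by this sketch */

    /* The inner assignment converts `wide` to unsigned char; the
     * value of (narrow = wide) is what ended up in `narrow`. */
    result = (narrow = wide);
    return result;
}
```

On a correct compiler assignment_value(0x1234) is 0x34, the low byte that was actually stored. A compiler that took the right-hand side as the value, as described above, would presumably have produced 0x1234, and code written to rely on that would break elsewhere.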

Along with the Ritchie C compiler,
V7 of UNIX includes the first release 
outside Bell Labs of a second C compil- 
er, bearing the impressive name of “Por- 
table C Compiler.” Written by S. C. 
Johnson, this compiler has been in de- 
velopment since 1975 and uses the pro- 
gram development tool yacc but not lex. 
(A part of the yacc grammar for this 
compiler was published in the C manual 
with PWB in 1977.) 

The portable C compiler turns out 
to be not as portable as desired, so a sec- 
ond version is developed over the next 
few years, called pcc2. At the ACM Na- 
tional Conference in 1983, Steve John- 
son describes pcc2 in some detail, and 
shows an example of its portability. Of 
the many “back ends” for it, one com- 
piles a C language algorithm into the 
commands necessary to drive a VLSI 
fabrication process. So your program (if 
you work in the right part of the Labs!) 
can be compiled into a custom micro- 
processor, optimised to execute your 
program and nothing else! That sure 
outclasses the EPROM versions of Intel 
and Motorola microprocessors. pcc2 
has only recently been released; it is the 
C compiler for the Software Generation 
System. 

One immediate beneficiary of the 
two-pass nature of pcc was the Fortran 
compiler, to which we will return short- 
ly. But a second major fallout from pcc 
is a program called lint, which does par- 
tial compilation of C programs with 
much greater error checking. Like pcc, 
lint first appears with V7. We continue 
to recommend the use of lint to provide 
some reassurance of program correct- 
ness and portability. 

Berkeley has taken the C language 
in some new directions. They have re- 
laxed some restrictions on compiled 
programs. Most notably, variable names
can be almost any length and need not be
unique in the first seven or eight charac- 
ters. While this sounds handy, it is a ma- 
jor annoyance to the rest of the world, 
which has to change programs written 
with such “features” in order even to 
compile them. Berkeley programmers 
also tend to rely to an unprecedented ex- 
tent on the ‘asm’ keyword, which allows 
you to interpolate assembler language 
code into the middle of the C program.
‘asm’ buys an increase in
microefficiency, but only at the cost of a
tremendous loss of portability. To
preserve portability, the programmer
should use #ifdef to include in the source
code both an assembler version and a
portable C version. But the latter is often
omitted. A fine example was shown in a
talk by Mike Tilson of Toronto’s Hu- 
man Computing Resources at the San 
Diego USENIX Conference in January, 
1983. (See Tilson’s article on page 84, 
based on his talk). Here’s the code: 


to = bp->b_ptr;
asm("movc3 r8,(r11),(r7)");
bp->b_ptr += put;

What it does is left as an exercise to 
the reader. The writer of this code left 
no clues as to how his mayhem works. 
As Mike says: “The variable ‘to’ is one 
of the registers used in this VAX assem- 
bly instruction. You guess which.” Oh, 
we almost forgot. The three lines above
are Copyright ©1980 by the Regents of
the University of California. 
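
For contrast, here is roughly what the #ifdef discipline recommended above might look like. This is a sketch, not the Berkeley code: the function copy_block and the build flag USE_VAX_ASM are hypothetical, and the assembler branch is left as a placeholder because a correct version depends on the register allocation of one particular compiler.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy count bytes from `from` to `to`. Both versions live side by
 * side, so the file still compiles on machines where the assembler
 * version is useless.
 */
void copy_block(char *to, const char *from, size_t count)
{
    assert(to != NULL && from != NULL);
#ifdef USE_VAX_ASM              /* hypothetical flag, set only on a VAX */
    /* The hand-tuned movc3 version would go here. */
#error "USE_VAX_ASM is set but this sketch supplies no assembler version"
#else
    /* Portable version: slower perhaps, but it runs anywhere. */
    while (count-- > 0)
        *to++ = *from++;
#endif
}
```

The portable branch also serves as documentation: anyone puzzled by the assembler can read the C beside it to learn what the routine is meant to do.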

Meanwhile, back at Research, B. 
Stroustrup has been busy adding “class- 
es” to C. Classes (nothing to do with go- 
ing back to school) are the interesting 
part of Simula 67. They provide for
orderly interchange of data between
modules, with no possibility of hidden
dependencies. A class consists of data
(normally inaccessible from outside the 
class) and functions which are normally 
accessible from outside but which may 
be declared as inaccessible. One typical- 
ly defines a class and publishes the 
names of the accessible functions. Func- 
tions outside the class cannot reference 
the data within that class except by call- 
ing the class’s publicly accessible func- 
tions. This enforces modularity by hid- 
ing the details of a particular class’s 
internals from other routines. Classes 
can be nested, of course, so you can de- 
velop such things as queues and stacks 
of objects. The C compiler
encompassing these developments,
including classes and declaration of
function parameters for type checking,
is given the name “C++”. C
programmers will recognize the pun; for
others, it simply means “an incremented
(augmented?) form of the C language,
which retains the value of the old
language.” C++ has been in
use for some time within the Labs, and 
may be cleared for external release soon. 

In addition to the compiler, one 
needs a series of library routines to do 
Input/Output and some ‘extra-linguis- 
tic’ operations such as setexit() in V6 or 
longjmp() in V7. The first ‘portable C 
library’ was written by Mike Lesk and 
was implemented on the PDP-11, the 
IBM 370, and the Honeywell 6000 with 
the GCOS operating system. It set the 
style for subsequent development, and 
in Version 7 there was a “new portable 
I/O library” written by Ritchie. This 
has become known as ‘stdio’ (pro- 
nounced “stuh-DYE-oh”) for the name 
of its header file, and is the I/O library 
distributed with all real UNIXes today. 
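
To give a feel for what an ‘extra-linguistic’ operation looks like, here is a small sketch of a non-local exit using setjmp() and longjmp() as they exist in today's standard C library. The function and variable names (sum_nonnegative, checked, bail_out) are our own, chosen only for illustration.

```c
#include <setjmp.h>

static jmp_buf bail_out;    /* saved context to return to on error */

/* Return value unchanged, or abandon the whole computation if it is
 * negative by jumping straight back to the setjmp() call site. */
static int checked(int value)
{
    if (value < 0)
        longjmp(bail_out, 1);
    return value;
}

/* Sum an array of non-negative ints; return -1 if any entry is negative. */
int sum_nonnegative(const int *values, int count)
{
    int i, total = 0;

    if (setjmp(bail_out) != 0)
        return -1;          /* we arrived here via longjmp() */

    for (i = 0; i < count; i++)
        total += checked(values[i]);
    return total;
}
```

The error return skips every intermediate stack frame, something the language proper has no statement for; that is why the facility lives in the library.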

The current C compilers for the 
PDP-11 continue to derive from the 
Ritchie versions. pcc for the PDP-11 
never worked as well as the Ritchie 
compiler. Most other machines use pcc- 
based C translators, since the Ritchie 
compiler only works for PDP-11s. 
Many systems integrators wait earnest- 
ly for the release of pcc2, since porting 
pcc takes a non-trivial amount of work. 

The future of the C language is not 
primarily in the hands of people like 
Dennis Ritchie and Steve Johnson and 
B. Stroustrup. Rather, it is in the hands 
of the ANSI C Language standards 
committee. But in the final sense, it is in 
the hands of programmers everywhere. 
This is partly because ANSI is a demo- 
cratic agency, and any member of the 
committee has as much voice as a Den- 
nis Ritchie or a Steve Johnson. It is also 
because C is a powerful language, and 
like all powerful tools it can be used or 
abused. This is not the place for a tutori- 
al on C style, but the interested reader 
can refer to the article by Tilson cited 
above. Good use of C leads to rapid de- 
velopment of maintainable code; poor 
use of the language leads to code that 
looks like it was written in assembler. 
As we have seen, in a few cases it has 
been. 

Fortran. Since UNIX comes from a 
Computer Science research back- 
ground, it is perhaps natural that For- 
tran, that octogenarian, reptilian, but 
ubiquitous language, should be the ob- 
ject of some disdain among UNIX- 
ophiles. Indeed, the V6 how to get start- 
ed document says that “no debugger is 
much help for Fortran.” And the Sixth 
Edition manual set included the C Ref- 
erence and the C Tutorial, but nothing 
on Fortran beyond the manual page for 
fc(1), a compiler for a slight variant of 
the ANSI Fortran-66 standard. 

fc produced executable programs 
that used threaded code and floating- 
point instructions heavily; thus it ran 
slowly on machines without a floating- 
point processor, on which floating point 
had to be interpreted by the UNIX 
kernel. 

The prime mover behind the next 
Fortran compiler was Stuart Feldman, 
who had been interested in compilers 
for some time. In 1976 he released a 
CSTR on “Fortlex — A General Pur- 
pose Lexical Analyzer for Fortran.” 
This program reads a Fortran program 
and breaks it up into lexical tokens of 
the appropriate type. fortlex was used in 
the construction of various Fortran pro- 
gramming aids, such as a program to 
change all double-precision variables, 
functions and library calls to single pre- 
cision. The paper also includes the yacc 
grammar for a Fortran scanner to be 
used with fortlex — not a complete com- 
piler, but possibly a basis for one. 

And the Fortran weakness of 
UNIX was remedied with a vengeance 
for the Seventh Edition. A compiler for 
the full ANSI Fortran-77 standard was 
included, the first implementation of 
the 1977 standard on any system any- 
where, along with a paper detailing its 
use and implementation. One back end 
of this compiler was the same back end 
as the Portable C Compiler, so that it 
would be easy to adapt to new 
computers. 

Although the Fortran compiler is 
part of all standard UNIX systems, 
most suppliers of 68000-based UNIX 
boxes do not include f77. Whether this 
is so they can charge extra for it, or be- 
cause they couldn’t figure out how to 
port it, is unclear. But commercial im- 
plementations are available for most mi- 
cro-based UNIXes. 

yacc+lex. One of the major tools 
used in compiler development is the 
yacc (yet another compiler compiler) 
program by Steve Johnson. When this 
program was developed in the early 
1970s, compiler generators were being 
written at many universities and oth- 
er research institutes. As Kernighan 
and Pike remark, Johnson’s choice of 
name for his program is ironic in that 
his has endured while most of the others 
have now been retired. 

yacc reads in the specification of 
the syntax of a language and generates a 
program which parses that language. 
Note that this is not limited to “pro- 
gramming languages,” but can be ap- 
plied to any input that is structured. 
Many applications of yacc are men- 
tioned in the yacc manual. yacc is also 
part of the nrws nroff-to-WordStar pro- 
gram used to translate some articles for 
Microsystems. The yacc manual men- 
tions “compilers for C, APL, Pascal, 
RATFOR, etc. .... a phototypesetter 
system, several desk calculators, a docu- 
ment retrieval system, and a Fortran de- 
bugging system” as programs that have 
been written in yacc. More recently, Co- 
bol and Ada compilers have been con- 
structed outside of the Labs. 

But syntax analysis is only one part 
of compilation. Another part is lexical 
analysis, or scanning of the input to find 
certain kinds of tokens. For this, too, 
UNIX has an answer. The lex program 
by Mike Lesk and E. Schmidt provides 
this function. Since it is part of the 
UNIX tradition, of course, lex uses 
many of the same conventions. In par- 
ticular, lex uses a variation on the nota- 
tion for “regular expressions” that is used 
in the editors and elsewhere to describe 
the patterns to be looked for. If you’ve 
mastered commands like /[hH]e/ in the 
editor, you already know most of what 
you need to know to construct expres- 
sions for lex. And of course it works 
with yacc. The naming conventions of 
these two programs are such that both 
can be loaded together to form a work- 
ing unit. Indeed many programs consist 
of yacc and lex outputs compiled and 
loaded together. 

yacc was present in Version 6 (the 
manual page is dated late 1974); lex first 
appeared outside the Labs in the PWB 
1.0 release. 

make. It’s hard to imagine UNIX 
without the make utility. make is so tak- 
en for granted these days that the distri- 
bution of software in source form with- 
out a makefile is an event worthy of 
attention and inquiry. But there was a 
time when the name of the file with the 
instructions to build a system was cho- 
sen at random from the names ‘build’, 
‘re’, ‘run’, ‘runfile,’ and others. And 
these were shell files which built the en- 
tire component. 

make builds a program or compo- 
nent from individual pieces, and recom- 
piles only the minimum needed to re- 
build it as changes are made. The 
edit-make-debug cycle is well known to 
UNIX programmers. Since the topic 
has been treated in detail in “The UNIX 
File” by one of us, we will not expand on 
it here. Suffice to say that make was 
written by Stuart Feldman at Research 
and first appeared outside Bell in the 
PWB release of the system. The barbar- 
ities of the Source Code Control System 
and its incompatibility with 
most of UNIX, including make, led not 
to the correction of SCCS, but to an “en- 
hanced” make that appears publicly in 
System III and System V. 
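
The heart of make's economy, rebuilding only what is out of date, can be sketched in a few lines of C. This is our own simplified illustration, not Feldman's code: the function name needs_rebuild and its parameters are hypothetical, and a real tool would obtain the modification times from stat(2) and walk a whole dependency graph.

```c
#include <stddef.h>
#include <time.h>

/* Decide whether a target must be rebuilt.
 * Returns 1 if the target is missing or older than any of the files it
 * depends on, and 0 if it is up to date. All names are hypothetical. */
int needs_rebuild(int target_exists, time_t target_mtime,
                  const time_t *dep_mtimes, size_t ndeps)
{
    size_t i;

    if (!target_exists)
        return 1;                       /* nothing there yet: build it */

    for (i = 0; i < ndeps; i++)
        if (dep_mtimes[i] > target_mtime)
            return 1;                   /* a dependency changed later */

    return 0;                           /* target is newer than all inputs */
}
```

Applied to each object file in turn, a test like this is what lets an edit to one source file trigger one recompilation instead of a rebuild of the entire component.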

ratfor, efl. Brian Kernighan at Re- 
search realized that Fortran would not 
go away, so he did something about it. 
He fixed it. He fixed it by adding the 
control structures of C and the defini- 
tion and inclusion capabilities of the C 
preprocessor. The converter which 
takes in ‘rationalised Fortran’ and pro- 
duces ugly conventional Fortran he 
called ‘Ratfor.’ Several versions were 
written; one in Fortran for bootstrap- 
ping onto other systems, another with 
yacc and lex as mentioned above. The 
meaning of Ratfor is best told in the 
book Software Tools co-written with P. 
J. Plauger. The source code for the pro- 
grams in the book was made available 
on magnetic tape by Addison-Wesley in 
a move that was very foresighted for 
1976. This led to the formation of the 
Software Tools Users Group at Law- 
rence Berkeley Laboratory in Berkeley; this 
group is still active and co-sponsors 
meetings with USENIX. 

The Software Tools book would lat- 
er be redone in Pascal (see ‘Pascal’ sec- 
tion). There are no plans announced for 
doing a “Software Tools in C” book; 
most of what you need is in Kernighan 
& Pike’s book, The UNIX Programming 
Environment. 

After the Fortran-77 compiler, Stu- 
art Feldman turned his attention to For- 
tran extensions, and produced the efl 
language. This combines the control 
structures of Ratfor (which in turn de- 
rive from C) with the data-structuring 
capabilities of C, including the aggre- 
gates to group related data items, 
analogous to Pascal’s record capability. 
efl is included in System III and some 
4.xBSD systems. Some microcomputer 
ports (e.g., UniSoft) include efl even 
though they don’t have a Fortran 
compiler. 

awk. Aho, Weinberger, and 
Kernighan. The initials of three authors 
put together in the most pronounceable 
way. That’s what they did when they 
couldn’t think of a more imaginative 
name for a wonderful program they’d 
devised. A is Al Aho, of compiler book 
fame. W is Peter Weinberger of Re- 
search. And K is Brian Kernighan, just 
mentioned for his work on Ratfor. 
(Kernighan and Pike’s book remarks 
that “Naming a language after its au- 
thors also shows a certain poverty of 
imagination” [page 131]). awk is not at 
all awkward; it is a great simplification. 
You can think of it as the combination 
of most of the best ideas of the other 
tools all rolled into one. We use it all the 
time. For some examples, see the review 
of “Leverage” in the August 1984 issue. 
You can enter the awk commands from 
the command line if they are simple 
enough, so that 

awk -F: '{print $1}' /etc/passwd 

is a complete program to print the 
names of all the accounts shown in 
/etc/passwd, the standard place for the 
names of all accounts on the system. By 
all means learn about awk and use it. It 
will make life far less awkward. 
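
To appreciate the simplification, consider roughly what the one-line awk program asks of a C programmer for each input line. The sketch below is our own; the function name first_field and its parameters are hypothetical, and the loop that reads the file and prints each result is still left to be written.

```c
#include <stddef.h>
#include <string.h>

/* Copy the first field of line (the text before the first sep or
 * newline) into out, truncating to fit, and return the copied length.
 * first_field is a hypothetical helper for illustration only. */
size_t first_field(const char *line, char sep, char *out, size_t outsize)
{
    char reject[3];
    size_t len;

    if (outsize == 0)
        return 0;

    reject[0] = sep;
    reject[1] = '\n';
    reject[2] = '\0';

    len = strcspn(line, reject);    /* length up to separator or newline */
    if (len >= outsize)
        len = outsize - 1;          /* leave room for the terminator */

    memcpy(out, line, len);
    out[len] = '\0';
    return len;
}
```

Given a password-file line beginning "root:x:0:0:" and a colon as separator, the helper yields "root". awk does the reading, splitting, and printing for you.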

awk was first described in Software 
Practice and Experience in July 1978, 
and first distributed with Seventh Edi- 
tion UNIX. 

Pascal. Pascal did not catch on at 
Research. In 1981, Brian Kernighan did 
a paper published as a CSTR entitled 
“Why Pascal is not my Favorite Pro- 
gramming Language.” The note was 
not based solely on introspection, for he 
and P. J. Plauger had just converted 
their book Software Tools into Software 
Tools in Pascal, including re-coding all 
the programs in Pascal. In the process 
they came to regard Pascal as their not 
favorite language. 

Berkeley UNIX has included Pas- 
cal for a long time. Ken Thompson 
wrote the first version of Berkeley Pas- 
cal at Berkeley while working there as 
Visiting Mackay Lecturer in Computer 
Science in 1975/76. He spent the aca- 
demic year at UC Berkeley, and taught 
several courses in Computer Science. 
He recalls: “When I arrived, the CS de- 
partment shared an 11/45 with Statis- 
tics. It was 50-50 UNIX and RSTS. The 
first advance was an 11/70 dedicated to 
teaching. I put my first 155 [operating 
systems] course on it. Between the first 
and second quarters I wrote the Berke- 
ley Pascal and talked Bob Fabry into us- 
ing it on his 153 [data structures] class. 
It has been used for that ever since. By 
the time I left, there were several (2 or 3) 
11/70s in the computing center provid- 
ing UNIX service. CS had the 11/70 for 
teaching; they had almost completely 
taken over the stat 11/45 and there was 
a research 11/40” in an AI lab [Thomp- 
son, personal correspondence]. 

Pascal compilers can be had for 
most 68000-based UNIX boxes. These 
are available from commercial software 
firms and OEMs — see the annual 
UNIX software directory in the April 
Microsystems. 

The S System. Finally, we cannot 
overlook an interesting “application” 
language from Research. S: An Interac- 
tive Environment for Data Analysis and 
Graphics, the title of the 1984 book by 
Richard A. Becker and John M. Cham- 
bers, puts it succinctly. We put it as fol- 
lows: The S package is to conventional 
mainframe statistics packages as the 
UNIX shell is to batch Job Control lan- 
guages. It provides interactive explor- 
atory statistics, interactive and offline 
graphics, plus data modelling and time 
series manipulation. 

The S language was developed at 
Research. The first public release was in 
1981, which was for V7 and 32V. There 
were several interim releases; the next 
major release was in early 1984. This re- 
lease was accompanied by a change in li- 
censing and an order-of-magnitude cost 
increase for non-educational users, as 
part of the swing to the commercialisa- 
tion of UNIX by AT&T Technologies. 

The only remotely similar products 
that I know of in all of computerdom 
are APL and Speakeasy. APL was first 
implemented on IBM systems; at least 
one version for UNIX was developed. 
Speakeasy similarly arose on IBM hard- 
ware; a subset version called SpeakeC 
was developed at Purdue. Speakeasy 
was developed some time before S, but 
in quite different circles of influence. 
There appears to be no cross-pollination 
between the two, although many of the 
ideas are similar. S uses yacc to interpret 
its grammar; the yacc specification ap- 
peared in a CACM paper in 1984. 

Interlude 

The Computer Science Research 
Group, Centre 127, at Bell Labs has had 
an impact on computerdom far out of 
proportion to the number of people 
working there. A small group of talent- 
ed people, started in motion by Ken 
Thompson and Dennis Ritchie with the 
original design of UNIX and Rudd 
Canaday’s file system design, aided and 
abetted by those mentioned here and 
others, developed Research UNIX and 
its related tools. Many reasons are given 
for the success of UNIX, but one we’d 
like to add is the consistency of the sys- 
tem in all its facets. As a single example, 
the syntax used for pattern matching (a 
notation for the abstract concept of 
Regular Expressions) allows you to eas- 
ily develop tremendous skill in pattern 
matching. This skill, once learned, can 
be applied in the editor to find text, in 
awk to find records to be acted on, in lex 
to specify partitioning of an input, and (with 
simple modification) in shell commands 
to match filenames, and in a dozen or so 
other contexts. This kind of consistency is a 
rare treat in any computer system; the 
extent to which it permeates UNIX is 
exemplary. 
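
As a small demonstration that the skill carries over, the editor-style pattern [hH]e can be handed unchanged to a C program. The sketch below is ours and assumes a system that provides the POSIX regcomp() and regexec() routines from <regex.h>; the wrapper name matches is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if text contains a match for the basic regular expression
 * pattern, 0 if it does not, and -1 if the pattern fails to compile.
 * matches is a hypothetical wrapper around the POSIX regex routines. */
int matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* Flags without REG_EXTENDED select "basic" syntax, the editor's
     * dialect; REG_NOSUB says we only want a yes/no answer. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Calling matches("[hH]e", line) accepts the same lines that /[hH]e/ would find in the editor.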

This concludes the first installment 
of our history of the natural creation of 
the UNIX timesharing system. The 
next installment will cover the relation- 
ships among many, many different ver- 
sions of UNIX, most of them never re- 
leased or publicised outside of Bell Labs 
and the telephone companies. There 
will be a “family tree” diagram illustrat- 
ing the descent of UNIX. We hope you 
enjoyed this edition, and that you’ll be 
looking forward to the next. 